control value
X2C: A Dataset Featuring Nuanced Facial Expressions for Realistic Humanoid Imitation
Li, Peizhen, Cao, Longbing, Wu, Xiao-Ming, Yang, Runze, Yu, Xiaohan
The ability to imitate realistic facial expressions is essential for humanoid robots engaged in affective human-robot communication. However, the lack of datasets containing diverse humanoid facial expressions with proper annotations hinders progress in realistic humanoid facial expression imitation. To address these challenges, we introduce X2C (Anything to Control), a dataset featuring nuanced facial expressions for realistic humanoid imitation. With X2C, we contribute: 1) a high-quality, high-diversity, large-scale dataset comprising 100,000 (image, control value) pairs. Each image depicts a humanoid robot displaying a diverse range of facial expressions, annotated with 30 control values representing the ground-truth expression configuration; 2) X2CNet, a novel human-to-humanoid facial expression imitation framework that learns the correspondence between nuanced humanoid expressions and their underlying control values from X2C. It enables facial expression imitation in the wild for different human performers, providing a baseline for the imitation task, showcasing the potential value of our dataset; 3) real-world demonstrations on a physical humanoid robot, highlighting its capability to advance realistic humanoid facial expression imitation. Code and Data: https://lipzh5.github.io/X2CNet/
CAK: Emergent Audio Effects from Minimal Deep Learning
We demonstrate that a single 3x3 convolutional kernel can produce emergent audio effects when trained on 200 samples from a personalized corpus. We achieve this through two key techniques: (1) Conditioning Aware Kernels (CAK), where output = input + (learned_pattern x control), with a soft-gate mechanism supporting identity preservation at zero control; and (2) AuGAN (Audit GAN), which reframes adversarial training from "is this real?" to "did you apply the requested value?" Rather than learning to generate or detect forgeries, our networks cooperate to verify control application, discovering unique transformations. The learned kernel exhibits a diagonal structure creating frequency-dependent temporal shifts that are capable of producing musical effects based on input characteristics. Our results show the potential of adversarial training to discover audio transformations from minimal data, enabling new approaches to effect design.
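The relation output = input + (learned_pattern x control) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the `tanh` gate, the `sharpness` parameter, the helper names, and the identity matrix standing in for a learned diagonal kernel are all assumptions made for illustration. The only property taken from the abstract is that zero control returns the input unchanged.

```python
import numpy as np

def conv2d_same(x, kernel):
    """Naive zero-padded 'same' 2D cross-correlation with a small kernel."""
    kh, kw = kernel.shape
    padded = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def soft_gate(control, sharpness=5.0):
    # Hypothetical gate: exactly 0 at control == 0, approaching 1 as |control| grows.
    return np.tanh(sharpness * abs(control))

def cak_apply(x, kernel, control):
    # output = input + (pattern * control), scaled by the assumed soft gate.
    pattern = conv2d_same(x, kernel)
    return x + pattern * control * soft_gate(control)

# Stand-in for a learned kernel with diagonal structure (assumption).
kernel = np.eye(3)
spectrogram = np.zeros((5, 5))
spectrogram[2, 2] = 1.0  # a single time-frequency impulse

unchanged = cak_apply(spectrogram, kernel, control=0.0)
shifted = cak_apply(spectrogram, kernel, control=1.0)
```

With the diagonal stand-in kernel, the impulse spreads along the diagonal of the time-frequency grid, which gives a rough picture of a frequency-dependent temporal shift.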
Barrier Function Overrides For Non-Convex Fixed Wing Flight Control and Self-Driving Cars
Squires, Eric, Odom, Phillip, Kira, Zsolt
Reinforcement Learning (RL) has enabled vast performance improvements for robotics systems. To achieve these results though, the agent often must randomly explore the environment, which for safety critical systems presents a significant challenge. Barrier functions can solve this challenge by enabling an override that approximates the RL control input as closely as possible without violating a safety constraint. Unfortunately, this override can be computationally intractable in cases where the dynamics are not convex in the control input or when time is discrete, as is often the case when training RL systems. We therefore consider these cases, developing novel barrier functions for two non-convex systems (fixed wing aircraft and self-driving cars performing lane merging with adaptive cruise control) in discrete time. Although solving for an online and optimal override is in general intractable when the dynamics are nonconvex in the control input, we investigate approximate solutions, finding that these approximations enable performance commensurate with baseline RL methods with zero safety violations. In particular, even without attempting to solve for the optimal override at all, performance is still competitive with baseline RL performance. We discuss the tradeoffs of the approximate override solutions including performance and computational tractability.
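One way to picture an approximate override is to test a finite set of candidate controls and keep the safe one closest to the RL action. The sketch below assumes a commonly used discrete-time barrier condition, h(x_next) >= (1 - gamma) * h(x), which the abstract does not spell out. The function names, the toy one-dimensional dynamics, and all constants are hypothetical, and this is not the authors' method.

```python
import numpy as np

def override(x, u_rl, candidates, step, h, gamma=0.5):
    """Return the candidate closest to u_rl that satisfies the assumed
    discrete-time barrier condition, or None if no candidate is safe."""
    safe = [u for u in candidates if h(step(x, u)) >= (1.0 - gamma) * h(x)]
    if not safe:
        return None
    return min(safe, key=lambda u: abs(u - u_rl))

# Toy system (assumption): position x moves by 0.1 * u per step,
# and the safe set is x <= 1, encoded as h(x) = 1 - x >= 0.
step = lambda x, u: x + 0.1 * u
h = lambda x: 1.0 - x
candidates = np.linspace(-1.0, 1.0, 9)  # coarse grid of control inputs

u_aggressive = override(0.88, u_rl=1.0, candidates=candidates, step=step, h=h)
u_benign = override(0.88, u_rl=-0.25, candidates=candidates, step=step, h=h)
```

Enumerating candidates avoids solving an optimization problem, so it stays cheap even when the dynamics are non-convex in the control input. The cost is that the chosen control is only as close to the RL action as the grid allows.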
ExFace: Expressive Facial Control for Humanoid Robots with Diffusion Transformers and Bootstrap Training
Zhang, Dong, Peng, Jingwei, Jiao, Yuyang, Gu, Jiayuan, Yu, Jingyi, Chen, Jiahao
This paper presents a novel Expressive Facial Control (ExFace) method based on Diffusion Transformers, which achieves precise mapping from human facial blend-shapes to bionic robot motor control. By incorporating an innovative model bootstrap training strategy, our approach not only generates high-quality facial expressions but also significantly improves accuracy and smoothness. Experimental results demonstrate that the proposed method outperforms previous methods in terms of accuracy, frames per second (FPS), and response time. Furthermore, we develop the ExFace dataset driven by human facial data. ExFace shows excellent real-time performance and natural expression rendering in applications such as robot performances and human-robot interactions, offering a new solution for bionic robot interaction. Facial expressions are integral to human communication, playing a pivotal role in the transmission of emotions, attitudes, and intentions. As evidenced in prior research, individuals rely on a variety of facial expressions to both convey and interpret affective states [1].
Evaluating the Smooth Control of Attribute Intensity in Text Generation with LLMs
Zhou, Shang, Yao, Feng, Dong, Chengyu, Wang, Zihan, Shang, Jingbo
Controlling the attribute intensity of text generation is crucial across scenarios (e.g., writing conciseness, chatting emotion, and explanation clarity). The remarkable capabilities of large language models (LLMs) have revolutionized text generation, prompting us to explore such smooth control of LLM generation. Specifically, we propose metrics to assess the range, calibration, and consistency of the generated text's attribute intensity in response to varying control values, as well as its relevance to the intended context. To quantify the attribute intensity and context relevance, we propose an effective evaluation framework leveraging the Elo rating system and GPT4, both renowned for their robust alignment with human judgment. We look into two viable training-free methods for achieving smooth control of LLMs: (1) Prompting with semantic shifters, and (2) Modifying internal model representations. The evaluations of these two methods are conducted on 5 different attributes with various models. Our code and dataset can be obtained from https://github.com/ShangDataLab/Smooth-Control.
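For readers unfamiliar with the Elo rating system mentioned above, the standard update rule can be sketched as follows. How the paper feeds pairwise judgments into Elo is not described in the abstract, so the K-factor of 32, the starting rating of 1500, and the function names are assumptions for illustration only.

```python
def expected_score(r_a, r_b):
    """Standard Elo expected score of A against B."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

def elo_update(r_a, r_b, score_a, k=32.0):
    """Update both ratings after one comparison.
    score_a is 1.0 if A wins, 0.0 if A loses, 0.5 for a tie."""
    e_a = expected_score(r_a, r_b)
    new_a = r_a + k * (score_a - e_a)
    new_b = r_b + k * ((1.0 - score_a) - (1.0 - e_a))
    return new_a, new_b

# Two generations start at the same rating; a judge prefers A as showing
# stronger attribute intensity, so A gains what B loses.
rating_a, rating_b = elo_update(1500.0, 1500.0, score_a=1.0)
```

Repeating this update over many pairwise comparisons yields a scalar rating per generated text, which can then be compared against the requested control value.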
Controlling Pre-trained Language Models for Grade-Specific Text Simplification
Agrawal, Sweta, Carpuat, Marine
Text simplification (TS) systems rewrite text to make it more readable while preserving its content. However, what makes a text easy to read depends on the intended readers. Recent work has shown that pre-trained language models can simplify text using a wealth of techniques to control output simplicity, ranging from specifying only the desired reading grade level, to directly specifying low-level edit operations. Yet it remains unclear how to set these control parameters in practice. Existing approaches set them at the corpus level, disregarding the complexity of individual inputs and considering only one level of output complexity. In this work, we conduct an empirical study to understand how different control mechanisms impact the adequacy and simplicity of text simplification systems. Based on these insights, we introduce a simple method that predicts the edit operations required for simplifying a text for a specific grade level on an instance-per-instance basis. This approach improves the quality of the simplified outputs over corpus-level search-based heuristics.
Robust Autonomous Vehicle Pursuit without Expert Steering Labels
Pan, Jiaxin, Zhou, Changyao, Gladkova, Mariia, Khan, Qadeer, Cremers, Daniel
In this work, we present a learning method for lateral and longitudinal motion control of an ego-vehicle for vehicle pursuit. The car being controlled does not have a pre-defined route; rather, it reactively adapts to follow a target vehicle while maintaining a safety distance. To train our model, we do not rely on steering labels recorded from an expert driver but effectively leverage a classical controller as an offline label generation tool. In addition, we account for the errors in the predicted control values, which can lead to a loss of tracking and catastrophic crashes of the controlled vehicle. To this end, we propose an effective data augmentation approach, which allows training a network capable of handling different views of the target vehicle. During the pursuit, the target vehicle is first localized using a Convolutional Neural Network. The network takes a single RGB image along with the cars' velocities and estimates the target vehicle's pose with respect to the ego-vehicle. This information is then fed to a Multi-Layer Perceptron, which regresses the control commands for the ego-vehicle, namely throttle and steering angle. We extensively validate our approach using the CARLA simulator on a wide range of terrains. Our method demonstrates real-time performance, high route completion, and robustness to different scenarios, including unseen trajectories. The project page containing code and multimedia can be publicly accessed here: https://changyaozhou.github.io/Autonomous-Vehicle-Pursuit/.
Performance Analysis of Universal Robot Control System Using Networked Predictive Control
Networked control systems are feedback control systems whose components are distributed across different locations and connected through a communication network. Because communication takes place over the internet, which imposes bandwidth and packet size limitations, network constraints such as time delay and packet loss arise. These network limitations can degrade performance and even destabilize the system. To overcome the adverse effect of these communication constraints, various approaches have been developed, among which a representative one is networked predictive control. This approach uses a controller that actively compensates for network time delay and packet loss. This paper aims at implementing a networked predictive control system for controlling a robot arm through a computer network. The network delay is accounted for by a predictor, while the potential for packet loss is mitigated using redundant control packets. The results show the stability of the system despite a high delay and considerable packet loss. Additionally, improvements to previous networked predictive control systems are suggested and an increase in performance is shown. Lastly, the effects of different system and environment parameters on the control loop are investigated.
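A common way to realize delay and loss compensation in networked predictive control is for the controller to send a sequence of predicted controls in each packet, so that the actuator can pick the entry matching the current step. The sketch below illustrates that idea only. The class names, the hold-last-value fallback, and the packet layout are assumptions and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ControlPacket:
    sent_step: int          # time step at which the sequence starts
    controls: List[float]   # predicted u[sent_step], u[sent_step + 1], ...

class Actuator:
    """Actuator-side buffer: keeps the newest packet and selects the
    predicted control that corresponds to the current step."""

    def __init__(self, fallback: float = 0.0):
        self.latest: Optional[ControlPacket] = None
        self.last_applied = fallback

    def receive(self, packet: ControlPacket) -> None:
        # Ignore packets that arrive out of order.
        if self.latest is None or packet.sent_step > self.latest.sent_step:
            self.latest = packet

    def control_at(self, step: int) -> float:
        if self.latest is not None:
            offset = step - self.latest.sent_step
            if 0 <= offset < len(self.latest.controls):
                self.last_applied = self.latest.controls[offset]
        # If the horizon is exhausted, hold the last applied value.
        return self.last_applied

actuator = Actuator()
actuator.receive(ControlPacket(sent_step=0, controls=[1.0, 2.0, 3.0]))
delayed = actuator.control_at(2)    # packet used two steps after it was sent
exhausted = actuator.control_at(5)  # no fresh packet and horizon exceeded
```

Because each packet already covers several future steps, a lost or late packet does not leave the actuator without a command as long as the delay stays within the prediction horizon.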
Check AI Adoption in 5 Minutes with an Easy, Free PoC! 5min PoC (Proof of Concept) on AI
But it looks like a hassle to do on your own. A Google search shows many advertisements for AI systems, but they look hard to learn. Also, even if you decide to do it, if it eventually fails, it will be a waste of time and money. This is just another interpretation of the temperature control program I presented previously. Simply input your data on the screen, then click the button a couple of times until you see convergence.